Skip to content

introduce a SystemEnv class to unify environment variable use and enable smooth transition#311

Open
MoralCode wants to merge 18 commits into
mainfrom
feat/flexible_env
Open

introduce a SystemEnv class to unify environment variable use and enable smooth transition#311
MoralCode wants to merge 18 commits into
mainfrom
feat/flexible_env

Conversation

@MoralCode
Copy link
Copy Markdown
Contributor

@MoralCode MoralCode commented May 8, 2026

Description
This PR introduces a new SystemEnv class that lives alongside SystemConfig and aims to centralize/unify all environment variable access.

The primary purpose of this is to slightly decouple the requested environment variable name from the name where the value is actually pulled from. For example, even if all in-code references use COLLECTOSS_DB for the database connection string, this layer will allow a values to be read from AUGUR_DB as a fallback if COLLECTOSS_DB (preferred) is not set.

This class is also set up to print a deprecation warning/notice whenever the AUGUR variant of an environment variable is used as a fallback. The message makes it clear that this fallback behavior is only temporary and may not be around forever (although I don't see this code being removed anytime in the foreseeable future)

This PR includes a unit tests for this new functionality and works as intended, including edge cases such as when the key the code requests doesn't use a known prefix (os.getenv("CONFIG_DATADIR"))

It also includes a special case function get_bool() because there were ~3 different places in the code where a small python one-liner (that i wrote) had been copied around for convenience, and this was a good opportunity to refactor that to prevent those three versions from ever drifting in implementation from each other. There are certainly more places this logic can be reused as well as we unify the way all env vars are processed.

This PR resolves one of the large items that needs to be done to complete #279 for the next release.

Notes for Reviewers
This change includes unit tests for "backwards" variable access that keep the code able to refer to AUGUR_ variables so that those can be renamed later

This PR does not actually change any environment variable names, but that can be done either in this PR or a followup depending on the discussion results

TODOs:

  • Not yet run to full collection or tested much beyond the unit tests - this will need to be done before merge (ideally by multiple maintainers)
  • Environment variables handled via Click (the CLI library) are not yet addressed. I suspect this will be done via some kind of function that instructs SystemEnv to iterate through all known env vars, detect the ones with a known deprecated prefix, and copy the values of those variables to new keys with the preferred prefix that click is expecting.

Signed commits

  • Yes, I signed my commits.

MoralCode added 18 commits May 8, 2026 16:30
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
…ode slowly over time

Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
@MoralCode MoralCode requested a review from shlokgilda May 8, 2026 20:52
prefixes = ["COLLECTOSS", "OTHER"]

def test_env_extract_prefix():
assert extract_prefix("OTHER_DB", prefixes) == "OTHER_"

def test_env_extract_prefix():
assert extract_prefix("OTHER_DB", prefixes) == "OTHER_"
assert extract_prefix("COLLECTOSS_DB", prefixes) == "COLLECTOSS_"
assert extract_prefix("COLLECTOSS_DB", prefixes) == "COLLECTOSS_"

def test_env_extract_prefix_default():
assert extract_prefix("SOME_DB", prefixes) is None

def test_env_extract_prefix_default():
assert extract_prefix("SOME_DB", prefixes) is None
assert extract_prefix("THINGY_DB", prefixes) is None


def test_env_extract_prefix_unprefixed():
assert extract_prefix("DB", prefixes) is None
assert SystemEnv.get("COLLECTOSS_MISSING", None, prefixes) is None

def test_fetching_env_default():
assert SystemEnv.get("COLLECTOSS_DEFAULT", "SOME", prefixes) == "SOME"
def test_no_known_prefix():
# fallback handling
os.environ["THING"] = "C"
assert SystemEnv.get("THING", None, prefixes) == "C"

for case in cases:
os.environ["OTHER_BOOL"] = case
assert SystemEnv.get_bool("OTHER_BOOL", False, prefixes) == True

for case in cases:
os.environ["OTHER_BOOL"] = case
assert SystemEnv.get_bool("OTHER_BOOL", True, prefixes) == False

for case in cases:
os.environ["OTHER_BOOL"] = case
assert SystemEnv.get_bool("OTHER_BOOL", False, prefixes) == False
@@ -1,13 +1,11 @@
import os
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
W0611: Unused import os (unused-import)

@@ -130,7 +131,7 @@ def start(ctx, disable_collection, development, pidfile, port):
processes = start_celery_worker_processes((core_worker_count, secondary_worker_count, facade_worker_count), disable_collection)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
W0621: Redefining name 'processes' from outer scope (line 396) (redefined-outer-name)

@@ -15,6 +15,7 @@
import requests
from redis.exceptions import ConnectionError as RedisConnectionError
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
W0611: Unused ConnectionError imported from redis.exceptions as RedisConnectionError (unused-import)

@@ -31,7 +32,7 @@

from keyman.KeyClient import KeyClient, KeyPublisher
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
W0611: Unused KeyClient imported from keyman.KeyClient (unused-import)

@@ -2,6 +2,7 @@
import os
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
W0611: Unused import os (unused-import)

@@ -2,6 +2,7 @@
import os
from collectoss.application.db.models import *
from collectoss.application.db.lib import bulk_insert_dicts, get_repo_by_repo_git, get_value, get_session
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
W0611: Unused get_session imported from collectoss.application.db.lib (unused-import)

@@ -2,6 +2,7 @@
import os
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
W0611: Unused import os (unused-import)

tasks = start_tasks + github_tasks + gitlab_tasks + git_tasks + materialized_view_tasks + frontend_tasks

if os.environ.get('AUGUR_DOCKER_DEPLOY') != "1":
if SystemEnv.get('AUGUR_DOCKER_DEPLOY') != "1":
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
E0602: Undefined variable 'SystemEnv' (undefined-variable)

@MoralCode MoralCode added this to the v1.1 Migration Release milestone May 8, 2026
Copy link
Copy Markdown

@shlokgilda shlokgilda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some inline comments

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is missing the SystemEnv import.

if k.startswith(p):
prefix_len += len(p)

if k[prefix_len] == separator:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two bugs in this block, easier to fix together:

  • k[prefix_len] will raise an error (likely IndexError) when the key is exactly the prefix (e.g. extract_prefix("AUGUR", ["AUGUR"])).
  • The return key[0:prefix_len] is unconditional once startswith matches, so extract_prefix("AUGURDB", ["AUGUR"]) returns "AUGUR" even though AUGURDB isn't actually prefixed. That then makes SystemEnv.get("AUGURDB") look up COLLECTOSS_DB/AUGUR_DB, which I think is wrong?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catches!

Comment on lines +19 to +29
prefix_len = 0
for p in prefixes:
p = p.upper()
k = key.upper()
if k.startswith(p):
prefix_len += len(p)

if k[prefix_len] == separator:
prefix_len += len(separator)
return key[0:prefix_len]
return None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
prefix_len = 0
for p in prefixes:
p = p.upper()
k = key.upper()
if k.startswith(p):
prefix_len += len(p)
if k[prefix_len] == separator:
prefix_len += len(separator)
return key[0:prefix_len]
return None
k = key.upper()
for p in prefixes:
p_up = p.upper()
if k == p_up:
return key[:len(p)]
if k.startswith(p_up + separator):
return key[:len(p) + len(separator)]
return None

Should probably also update the docstrings and and worth adding tests for both: extract_prefix("AUGUR", ...) and extract_prefix("AUGURDB", ...)?

msg = (
f"Environment variable '{check_key}' is deprecated. "
f"Use '{key}' instead. This automatic recovery may become a failure in a future version "
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this deprecation message telling the user to use the variable they already used? When call sites are still SystemEnv.get("AUGUR_DB"), key is AUGUR_DB, so the warning reads: "Environment variable 'AUGUR_DB' is deprecated. Use 'AUGUR_DB' instead." Should point at the canonical (non-warn) name?

Comment on lines +72 to +73
raw_val = cls.get(key, None, prefixes)
return raw_val.lower() in ('true', '1', 't', 'y', 'yes') if raw_val else default
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if raw_val is falsy on an empty string, so FOO="" returns default instead of False. Compared to the old os.getenv("FOO", "False").lower() in (...) pattern, it actually flips outcomes for the cases where default=True.

Suggested change
raw_val = cls.get(key, None, prefixes)
return raw_val.lower() in ('true', '1', 't', 'y', 'yes') if raw_val else default
raw_val = cls.get(key, None, prefixes)
if raw_val is None:
return default
return raw_val.lower() in ('true', '1', 't', 'y', 'yes')

return raw_val.lower() in ('true', '1', 't', 'y', 'yes') if raw_val else default

@classmethod
def set(cls, key: str, value: str, overwrite=True) -> None:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set writes the literal key, but get does prefix translation. so SystemEnv.set("AUGUR_DEV", "1") in backend.py plants an AUGUR_DEV that the next SystemEnv.get(...) happily resolves via fallback and warns about. The codebase ends up emitting deprecation warnings against itself. Similar to the deprecation warning issue earlier in get method. Let's have set rewrite to the canonical prefix (so writes always land on COLLECTOSS_*)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentional, the next step of this PR, ideally before the next release, is to change all the env vars to use the COLLECTOSS_ prefix

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this is a misguided attempt to not do too much in one PR lol


#this is path where our scorecard project is located
path_to_scorecard = os.getenv('SCORECARD_DIR', os.environ['HOME'] + '/scorecard')
path_to_scorecard = SystemEnv.get('SCORECARD_DIR', (SystemEnv.get('HOME') or "~") + '/scorecard')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be:

Suggested change
path_to_scorecard = SystemEnv.get('SCORECARD_DIR', (SystemEnv.get('HOME') or "~") + '/scorecard')
path_to_scorecard = SystemEnv.get('SCORECARD_DIR', os.path.expanduser('~/scorecard'))

If HOME is unset, the fallback becomes the literal string ~/scorecard. subprocess's cwd= won't expand ~ (that's a shell thing), so this silently hands a broken path downstream. The old os.environ['HOME'] would KeyError loudly. Wouldn't that be better?

Same thing applies to SCC_DIR in collectoss/tasks/git/scc_value_tasks/core.py:24.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is .expanduser() influenced by HOME?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants